Create a new analysis directories.
- general directory
- stains directory
- for plots
- for output of summary results
- for baseline tables
* General packages...
Update all/some/none? [a/s/n]:
For the ERA-CVD ‘druggable-MI-targets’ project (grantnumber: 01KL1802) we performed two related RNA sequencing (RNAseq) experiments:
conventional (‘bulk’) RNAseq using RNA extracted from carotid plaque samples, n ± 700. As of Thursday, October 17, 2024 all samples have been selected and RNA has been extracted; quality control (QC) was performed and we have a dataset of 635 samples.
single-cell RNAseq (scRNAseq) of at least n = 40 samples (20 females, 20 males). As of Thursday, October 17, 2024 data is available of 40 samples (3 females, 15 males), we are extending sampling to get more female samples.
Plaque samples are derived from carotid endarterectomies as part of the Athero-Express Biobank Study which is an ongoing study in the UMC Utrecht.
Here we map the CONVOCALS_downstream to single-cells from the plaques.
Here we obtain data from the CONVOCALS_downstream in plaques.
library(openxlsx)
gene_list_df <- read.xlsx(paste0(PROJECT_loc, "/targets/targets.xlsx"), sheet = "Genes")
gene_list <- unlist(gene_list_df$Gene)
gene_list [1] "SCGB3A2" "LIX1" "IGSF9B" "ND6" "ALB" "OR10A4" "RASEF" "NEDD4" "TRIM49D1" "TCL1A" "ND4L" "ATP8" "FBXO15" "F5" "TMEM212"
[16] "PTPRD" "CYP46A1" "OR2T33" "SORBS2" "ITGA7" "RNASE1" "FOS" "HMOX1" "LAPTM5" "MMP9" "SIGLEC1" "FTL" "CD14" "HCST" "TIMP3"
[31] "CCL2" "SAT1" "CD163" "PTGDS" "LGALS9" "ACKR1" "NT5DC2" "TGFBI" "C1QC" "S100A9"
First we will load the data:
Here we load the latest dataset from our Athero-Express single-cell RNA experiment.
# load(paste0(AESCRNA_loc, "/20210811.46.patients.KP.RData"))
# scRNAseqData <- seuset
# rm(seuset)
#
# saveRDS(scRNAseqData, paste0(AESCRNA_loc, "/20210811.46.patients.KP.RDS"))
scRNAseqData <- readRDS(paste0(AESCRNA_loc, "/20210811.46.patients.KP.RDS"))
scRNAseqDataAn object of class Seurat
36147 features across 4948 samples within 2 assays
Active assay: RNA (20111 features, 0 variable features)
3 layers present: counts, data, scale.data
1 other assay present: SCT
2 dimensional reductions calculated: pca, umap
The naming/classification is based on a combination conventional markers. We do not claim to know the exact identity of each cell, rather we refer to cells as ‘KIT+ Mast cells”-like cells. Likewise we refer to the cell clusters as ’communities’ of cells that exhibit similar properties, i.e. similar defining markers (e.g. KIT).
We will rename the cell types to human readable names.
### change names for clarity
backup.scRNAseqData = scRNAseqData
# get the old names to change to new names
UMAPPlot(scRNAseqData, label = FALSE, pt.size = 1.25, label.size = 4, group.by = "ident") [1] CD3+ T Cells I CD3+ T Cells IV CD34+ Endothelial Cells I
[4] CD3+ T Cells V CD3+CD56+ NK Cells II CD3+ T Cells VI
[7] CD68+IL18+TLR4+TREM2+ Resident macrophages CD3+CD56+ NK Cells I ACTA2+ Smooth Muscle Cells
[10] CD3+ T Cells II FOXP3+ T Cells CD34+ Endothelial Cells II
[13] CD3+ T Cells III CD68+CD1C+ Dendritic Cells CD68+CASP1+IL1B+SELL+ Inflammatory macrophages
[16] CD79A+ Class-switched Memory B Cells CD68+ABCA1+OLR1+TREM2+ Foam Cells CD68+KIT+ Mast Cells
[19] CD68+CD4+ Monocytes CD79+ Plasma B Cells
20 Levels: CD3+ T Cells I CD3+ T Cells II CD3+ T Cells III CD3+ T Cells IV CD68+IL18+TLR4+TREM2+ Resident macrophages ACTA2+ Smooth Muscle Cells ... CD79+ Plasma B Cells
celltypes <- c("CD68+CD4+ Monocytes" = "CD68+CD4+ Mono",
"CD68+IL18+TLR4+TREM2+ Resident macrophages" = "CD68+IL18+TLR4+TREM2+ MRes",
"CD68+CD1C+ Dendritic Cells" = "CD68+CD1C+ DC",
"CD68+CASP1+IL1B+SELL+ Inflammatory macrophages" = "CD68+CASP1+IL1B+SELL MInf",
"CD68+ABCA1+OLR1+TREM2+ Foam Cells" = "CD68+ABCA1+OLR1+TREM2+ FC",
# T-cells
"CD3+ T Cells I" = "CD3+ TC I",
"CD3+ T Cells II" = "CD3+ TC II",
"CD3+ T Cells III" = "CD3+ TC III",
"CD3+ T Cells IV" = "CD3+ TC IV",
"CD3+ T Cells V" = "CD3+ TC V",
"CD3+ T Cells VI" = "CD3+ TC VI",
"FOXP3+ T Cells" = "FOXP3+ TC",
# Endothelial cells
"CD34+ Endothelial Cells I" = "CD34+ EC I",
"CD34+ Endothelial Cells II" = "CD34+ EC II",
# SMC
"ACTA2+ Smooth Muscle Cells" = "ACTA2+ SMC",
# NK Cells
"CD3+CD56+ NK Cells I" = "CD3+CD56+ NK I",
"CD3+CD56+ NK Cells II" = "CD3+CD56+ NK II",
# Mast
"CD68+KIT+ Mast Cells" = "CD68+KIT+ MC",
"CD79A+ Class-switched Memory B Cells" = "CD79A+ BCmem",
"CD79+ Plasma B Cells" = "CD79+ BCplasma")
scRNAseqData <- Seurat::RenameIdents(object = scRNAseqData,
celltypes)UMAPPlot(scRNAseqData, label = TRUE, pt.size = 1.25, label.size = 4, group.by = "ident",
repel = TRUE)Loading the Athero-Express clinical data.
AEDB.CEA <- readRDS(file = paste0(OUT_loc, "/", Today,".",PROJECTNAME,".AEDB.CEA.RDS"))
# AEDB.CEA <- readRDS(file = paste0(OUT_loc, "/20240531.",PROJECTNAME,".AEDB.CEA.RDS"))# Baseline table variables
basetable_vars = c("Hospital", "ORyear", "Artery_summary",
"Age", "Gender",
# "TC_finalCU", "LDL_finalCU", "HDL_finalCU", "TG_finalCU",
"TC_final", "LDL_final", "HDL_final", "TG_final",
# "hsCRP_plasma",
"systolic", "diastoli", "GFR_MDRD", "BMI",
"KDOQI", "BMI_WHO",
"SmokerStatus", "AlcoholUse",
"DiabetesStatus",
"Hypertension.selfreport", "Hypertension.selfreportdrug", "Hypertension.composite", "Hypertension.drugs",
"Med.anticoagulants", "Med.all.antiplatelet", "Med.Statin.LLD",
"Stroke_Dx", "sympt", "Symptoms.5G", "AsymptSympt", "AsymptSympt2G",
"Symptoms.Update2G", "Symptoms.Update3G", "indexsymptoms_latest_4g",
"restenos", "stenose",
"CAD_history", "PAOD", "Peripheral.interv",
"EP_composite", "EP_composite_time", "EP_major", "EP_major_time",
"MAC_rankNorm", "SMC_rankNorm", "Macrophages.bin", "SMC.bin",
"Neutrophils_rankNorm", "MastCells_rankNorm",
"IPH.bin", "VesselDensity_rankNorm",
"Calc.bin", "Collagen.bin",
"Fat.bin_10", "Fat.bin_40", "OverallPlaquePhenotype", "Plaque_Vulnerability_Index",
"PCSK9_plasma", "PCSK9_plasma_rankNorm")
basetable_bin = c("Gender", "Artery_summary",
"KDOQI", "BMI_WHO",
"SmokerStatus", "AlcoholUse",
"DiabetesStatus",
"Hypertension.selfreport", "Hypertension.selfreportdrug", "Hypertension.composite", "Hypertension.drugs",
"Med.anticoagulants", "Med.all.antiplatelet", "Med.Statin.LLD",
"Stroke_Dx", "sympt", "Symptoms.5G", "AsymptSympt", "AsymptSympt2G",
"Symptoms.Update2G", "Symptoms.Update3G", "indexsymptoms_latest_4g",
"restenos", "stenose",
"CAD_history", "PAOD", "Peripheral.interv",
"EP_major", "EP_composite", "Macrophages.bin", "SMC.bin",
"IPH.bin",
"Calc.bin", "Collagen.bin",
"Fat.bin_10", "Fat.bin_40", "OverallPlaquePhenotype", "Plaque_Vulnerability_Index")
# basetable_bin
basetable_con = basetable_vars[!basetable_vars %in% basetable_bin]
# basetable_conmetadata <- scRNAseqData@meta.data %>% as_tibble() %>% separate(orig.ident, c("Patient", NA))
scRNAseqDataMeta <- metadata %>% distinct(Patient, .keep_all = TRUE)distinct: removed 4,902 rows (99%), 46 rows remaining
scRNAseqDataMetaAE <- merge(scRNAseqDataMeta, AEDB.CEA, by.x = "Patient", by.y = "STUDY_NUMBER", sort = FALSE, all.x = TRUE)
dim(scRNAseqDataMetaAE)[1] 46 1231
# Replace missing data
# Ref: https://cran.r-project.org/web/packages/naniar/vignettes/replace-with-na.html
require(naniar)
na_strings <- c("NA", "N A", "N / A", "N/A", "N/ A",
"Not Available", "Not available",
"missing",
"-999", "-99",
"No data available/missing", "No data available/Missing")
# Then you write ~.x %in% na_strings - which reads as “does this value occur in the list of NA strings”.
scRNAseqDataMetaAE %>%
replace_with_na_all(condition = ~.x %in% na_strings)cat("====================================================================================================")====================================================================================================
SELECTION THE SHIZZLE
- sanity checking PRIOR to selection
library(data.table)
require(labelled)
ae.gender <- to_factor(scRNAseqDataMetaAE$Gender)
ae.hospital <- to_factor(scRNAseqDataMetaAE$Hospital)
table(ae.gender, ae.hospital, dnn = c("Sex", "Hospital"), useNA = "ifany") Hospital
Sex St. Antonius, Nieuwegein UMC Utrecht <NA>
female 0 18 0
male 0 26 0
<NA> 0 0 2
ae.artery <- to_factor(scRNAseqDataMetaAE$Artery_summary)
table(ae.artery, ae.gender, dnn = c("Sex", "Artery"), useNA = "ifany") Artery
Sex female male <NA>
No artery known (yet), no surgery (patient ill, died, exited study), re-numbered to AAA 0 0 0
carotid (left & right) 18 25 0
femoral/iliac (left, right or both sides) 0 0 0
other carotid arteries (common, external) 0 1 0
carotid bypass and injury (left, right or both sides) 0 0 0
aneurysmata (carotid & femoral) 0 0 0
aorta 0 0 0
other arteries (renal, popliteal, vertebral) 0 0 0
femoral bypass, angioseal and injury (left, right or both sides) 0 0 0
<NA> 0 0 2
ae.gender
ae.ic female male <NA>
missing 0 0 0
no, died 0 0 0
yes 9 14 0
yes, health treatment when possible 5 7 0
yes, no health treatment 2 2 0
yes, no health treatment, no commercial business 1 2 0
yes, no tissue, no commerical business 0 0 0
yes, no tissue, no questionnaires, no medical info, no commercial business 0 0 0
yes, no questionnaires, no health treatment, no commercial business 0 0 0
yes, no questionnaires, health treatment when possible 0 0 0
yes, no tissue, no questionnaires, no health treatment, no commerical business 0 0 0
yes, no health treatment, no medical info, no commercial business 0 0 0
yes, no tissue, no questionnaires, no health treatment, no medical info, no commercial business 0 0 0
yes, no questionnaires, no health treatment 0 0 0
yes, no tissue, no health treatment 0 0 0
yes, no tissue, no questionnaires 0 0 0
yes, no tissue, health treatment when possible 0 0 0
yes, no tissue 0 0 0
yes, no commerical business 1 1 0
yes, health treatment when possible, no commercial business 0 0 0
yes, no medical info, no commercial business 0 0 0
yes, no questionnaires 0 0 0
yes, no tissue, no questionnaires, no health treatment, no medical info 0 0 0
yes, no tissue, no questionnaires, no health treatment, no commercial business 0 0 0
yes, no medical info 0 0 0
yes, no questionnaires, no commercial business 0 0 0
yes, no questionnaires, no health treatment, no medical info 0 0 0
yes, no questionnaires, health treatment when possible, no commercial business 0 0 0
yes, no health treatment, no medical info 0 0 0
no, doesn't want to 0 0 0
no, unable to sign 0 0 0
no, no reaction 0 0 0
no, lost 0 0 0
no, too old 0 0 0
yes, no medical info, health treatment when possible 0 0 0
no (never asked for IC because there was no tissue) 0 0 0
yes, no medical info, no commercial business, health treatment when possible 0 0 0
no, endpoint 0 0 0
wil niets invullen, wel alles gebruiken 0 0 0
second informed concents: yes, no commercial business 0 0 0
nooit geincludeerd 0 0 0
yes, not outside EU 0 0 0
yes, no DNA 0 0 0
<NA> 0 0 2
rm(ae.gender, ae.hospital, ae.artery, ae.ic)
scRNAseqDataMetaAE.all <- subset(scRNAseqDataMetaAE,
(Artery_summary == "carotid (left & right)" | Artery_summary == "other carotid arteries (common, external)" ) & # we only want carotids
informedconsent != "missing" & # we are really strict in selecting based on 'informed consent'!
informedconsent != "no, died" &
informedconsent != "yes, no tissue, no commerical business" &
informedconsent != "yes, no tissue, no questionnaires, no medical info, no commercial business" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no commerical business" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no medical info, no commercial business" &
informedconsent != "yes, no tissue, no health treatment" &
informedconsent != "yes, no tissue, no questionnaires" &
informedconsent != "yes, no tissue, health treatment when possible" &
informedconsent != "yes, no tissue" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no medical info" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no commercial business" &
informedconsent != "no, doesn't want to" &
informedconsent != "no, unable to sign" &
informedconsent != "no, no reaction" &
informedconsent != "no, lost" &
informedconsent != "no, too old" &
informedconsent != "yes, no medical info, health treatment when possible" &
informedconsent != "no (never asked for IC because there was no tissue)" &
informedconsent != "no, endpoint" &
informedconsent != "nooit geincludeerd" &
informedconsent != "yes, no health treatment, no commercial business" & # IMPORTANT: since we are sharing with a commercial party
informedconsent != "yes, no tissue, no commerical business" &
informedconsent != "yes, no tissue, no questionnaires, no medical info, no commercial business" &
informedconsent != "yes, no questionnaires, no health treatment, no commercial business" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no commerical business" &
informedconsent != "yes, no health treatment, no medical info, no commercial business" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no medical info, no commercial business" &
informedconsent != "yes, no commerical business" &
informedconsent != "yes, health treatment when possible, no commercial business" &
informedconsent != "yes, no medical info, no commercial business" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no commercial business" &
informedconsent != "yes, no questionnaires, no commercial business" &
informedconsent != "yes, no questionnaires, health treatment when possible, no commercial business" &
informedconsent != "second informed concents: yes, no commercial business")
# scRNAseqDataMetaAE.all[1:10, 1:10]
dim(scRNAseqDataMetaAE.all)[1] 39 1231
Showing the baseline table for the scRNAseq data in 39 CEA patients with informed consent.
===========================================================================================
CREATE BASELINE TABLE
# Create baseline tables
# http://rstudio-pubs-static.s3.amazonaws.com/13321_da314633db924dc78986a850813a50d5.html
scRNAseqDataMetaAE.all.tableOne = print(CreateTableOne(vars = basetable_vars,
# factorVars = basetable_bin,
# strata = "Gender",
data = scRNAseqDataMetaAE.all, includeNA = TRUE),
nonnormal = c(),
quote = FALSE, showAllLevels = TRUE,
format = "p",
contDigits = 3)[,1:2]Warning: These variables only have NA/NaN: MAC_rankNorm SMC_rankNorm Neutrophils_rankNorm MastCells_rankNorm IPH.bin VesselDensity_rankNorm PCSK9_plasma PCSK9_plasma_rankNorm Dropped
level Overall
n 39
Hospital (%) St. Antonius, Nieuwegein 0.0
UMC Utrecht 100.0
ORyear (%) No data available/missing 0.0
2002 0.0
2003 0.0
2004 0.0
2005 0.0
2006 0.0
2007 0.0
2008 0.0
2009 0.0
2010 0.0
2011 0.0
2012 0.0
2013 0.0
2014 0.0
2015 0.0
2016 0.0
2017 0.0
2018 51.3
2019 35.9
2020 10.3
2021 2.6
2022 0.0
Artery_summary (%) No artery known (yet), no surgery (patient ill, died, exited study), re-numbered to AAA 0.0
carotid (left & right) 97.4
femoral/iliac (left, right or both sides) 0.0
other carotid arteries (common, external) 2.6
carotid bypass and injury (left, right or both sides) 0.0
aneurysmata (carotid & femoral) 0.0
aorta 0.0
other arteries (renal, popliteal, vertebral) 0.0
femoral bypass, angioseal and injury (left, right or both sides) 0.0
Age (mean (SD)) 72.077 (8.183)
Gender (%) female 41.0
male 59.0
TC_final (mean (SD)) 4.533 (1.252)
LDL_final (mean (SD)) 2.676 (1.013)
HDL_final (mean (SD)) 1.135 (0.229)
TG_final (mean (SD)) 1.927 (1.093)
systolic (mean (SD)) 150.842 (27.013)
diastoli (mean (SD)) 79.711 (16.432)
GFR_MDRD (mean (SD)) 79.036 (31.777)
BMI (mean (SD)) 26.332 (3.962)
KDOQI (%) No data available/missing 0.0
Normal kidney function 28.2
CKD 2 (Mild) 33.3
CKD 3 (Moderate) 28.2
CKD 4 (Severe) 0.0
CKD 5 (Failure) 0.0
<NA> 10.3
BMI_WHO (%) No data available/missing 0.0
Underweight 2.6
Normal 33.3
Overweight 38.5
Obese 17.9
<NA> 7.7
SmokerStatus (%) Current smoker 28.2
Ex-smoker 53.8
Never smoked 12.8
<NA> 5.1
AlcoholUse (%) No 38.5
Yes 53.8
<NA> 7.7
DiabetesStatus (%) Control (no Diabetes Dx/Med) 71.8
Diabetes 28.2
Hypertension.selfreport (%) No data available/missing 0.0
no 7.7
yes 87.2
<NA> 5.1
Hypertension.selfreportdrug (%) No data available/missing 0.0
no 7.7
yes 87.2
<NA> 5.1
Hypertension.composite (%) No data available/missing 0.0
no 7.7
yes 92.3
Hypertension.drugs (%) No data available/missing 0.0
no 10.3
yes 84.6
<NA> 5.1
Med.anticoagulants (%) No data available/missing 0.0
no 87.2
yes 5.1
<NA> 7.7
Med.all.antiplatelet (%) No data available/missing 0.0
no 20.5
yes 74.4
<NA> 5.1
Med.Statin.LLD (%) No data available/missing 0.0
no 20.5
yes 74.4
<NA> 5.1
Stroke_Dx (%) Missing 0.0
No stroke diagnosed 56.4
Stroke diagnosed 43.6
sympt (%) missing 0.0
Asymptomatic 15.4
TIA 17.9
minor stroke 25.6
Major stroke 10.3
Amaurosis fugax 15.4
Four vessel disease 0.0
Vertebrobasilary TIA 0.0
Retinal infarction 2.6
Symptomatic, but aspecific symtoms 2.6
Contralateral symptomatic occlusion 0.0
retinal infarction 2.6
armclaudication due to occlusion subclavian artery, CEA needed for bypass 0.0
retinal infarction + TIAs 0.0
Ocular ischemic syndrome 7.7
ischemisch glaucoom 0.0
subclavian steal syndrome 0.0
TGA 0.0
Symptoms.5G (%) Asymptomatic 15.4
Ocular 23.1
Other 2.6
Retinal infarction 5.1
Stroke 35.9
TIA 17.9
AsymptSympt (%) Asymptomatic 15.4
Ocular and others 30.8
Symptomatic 53.8
AsymptSympt2G (%) Asymptomatic 15.4
Symptomatic 84.6
Symptoms.Update2G (%) Asymptomatic 15.4
Symptomatic 84.6
Symptoms.Update3G (%) Asymptomatic 15.4
Symptomatic 84.6
Unclear 0.0
indexsymptoms_latest_4g (%) asymp 15.4
ocular 28.2
TIA 20.5
stroke 35.9
unclear 0.0
restenos (%) missing 0.0
de novo 100.0
restenosis 0.0
stenose bij angioseal na PTCA 0.0
stenose (%) missing 0.0
0-49% 2.6
50-70% 10.3
70-90% 46.2
90-99% 25.6
100% (Occlusion) 0.0
NA 0.0
50-99% 0.0
70-99% 15.4
99 0.0
CAD_history (%) Missing 0.0
No history CAD 79.5
History CAD 20.5
PAOD (%) missing/no data 0.0
no 84.6
yes 15.4
Peripheral.interv (%) no 76.9
yes 23.1
EP_composite (%) No data available. 0.0
No composite endpoints 82.1
Composite endpoints 12.8
<NA> 5.1
EP_composite_time (mean (SD)) 522.597 (3078.209)
EP_major (%) No data available. 0.0
No major events (endpoints) 87.2
Major events (endpoints) 7.7
<NA> 5.1
EP_major_time (mean (SD)) 522.681 (3078.194)
Macrophages.bin (%) no/minor 2.6
moderate/heavy 0.0
<NA> 97.4
SMC.bin (%) no/minor 0.0
moderate/heavy 2.6
<NA> 97.4
Calc.bin (%) no/minor 2.6
moderate/heavy 0.0
<NA> 97.4
Collagen.bin (%) no/minor 0.0
moderate/heavy 2.6
<NA> 97.4
Fat.bin_10 (%) <10% 0.0
>10% 2.6
<NA> 97.4
Fat.bin_40 (%) <40% 2.6
>40% 0.0
<NA> 97.4
OverallPlaquePhenotype (%) atheromatous 0.0
fibroatheromatous 2.6
fibrous 0.0
<NA> 97.4
Plaque_Vulnerability_Index (%) 0 97.4
1 2.6
2 0.0
3 0.0
4 0.0
5 0.0
Writing the baseline table to Excel format.
# Write basetable
require(openxlsx)
# write.xlsx(file = paste0(BASELINE_loc, "/",Today,".",PROJECTNAME,".AESCRNA.32pts.after_qc.IC_commercial.BaselineTable.xlsx"),
# format(as.data.frame(scRNAseqDataMetaAE.all.tableOne), digits = 5, scientific = FALSE),
# rowNames = TRUE, colNames = TRUE,
# sheetName = "AESCRNA", overwrite = TRUE)
write.xlsx(file = paste0(BASELINE_loc, "/",Today,".",PROJECTNAME,".AESCRNA.CEA.39pts.after_qc.IC_academic.BaselineTable.xlsx"),
format(as.data.frame(scRNAseqDataMetaAE.all.tableOne), digits = 5, scientific = FALSE),
rowNames = TRUE, colNames = TRUE,
sheetName = "AESCRNA_CEA", overwrite = TRUE)Here review the number of cells per sample, plate, and patients. And plot the ratio’s per sample and study number.
How many cells per type ...?
integer(0)
# cat("\n\nHow many cells per plate ...?")
# sort(table(scRNAseqData@meta.data$ID))
# cat("\n\nHow many cells per type per plate ...?")
# table(scRNAseqData@meta.data$SCT_snn_res.0.8, scRNAseqData@meta.data$ID)
cat("\n\nHow many cells per patient ...?")
How many cells per patient ...?
4530 4675 4440 4605 4653 4472 4458 4455 4476 4587 4496 4601 4502 4501 4571 4478 4448 4477 4452 4459 4520 4602 4489 4432 4495 4545 4558 4480 4447 4500 4513 4535 4676
3 4 6 7 20 22 35 54 59 60 70 70 73 75 76 77 80 84 92 94 96 96 97 99 102 106 107 112 114 116 123 130 135
4486 4470 4487 4546 4488 4521 4580 4491 4541 4450 4542 4453 4443
137 144 144 144 146 161 163 175 178 205 213 222 422
Visualizing these ratio's per study number and sample ...?
UMAPPlot(scRNAseqData, label = TRUE, pt.size = 1.25, label.size = 4, group.by = "ident",
repel = TRUE)
ggsave(paste0(PLOT_loc, "/", Today, ".UMAP.png"), plot = last_plot())Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
# barplot(prop.table(x = table(scRNAseqData@active.ident, scRNAseqData@meta.data$Patient)),
# cex.axis = 1.0, cex.names = 0.5, las = 1,
# col = uithof_color, xlab = "study number", legend.text = FALSE, args.legend = list(x = "bottom"))
# dev.copy(pdf, paste0(QC_loc, "/", Today, ".cell_ratios_per_sample.pdf"))
# dev.off()
# barplot(prop.table(x = table(scRNAseqData@active.ident, scRNAseqData@meta.data$ID)),
# cex.axis = 1.0, cex.names = 0.5, las = 2,
# col = uithof_color, xlab = "sample ID", legend.text = FALSE, args.legend = list(x = "bottom"))
# dev.copy(pdf, paste0(QC_loc, "/", Today, ".cell_ratios_per_sample_per_plate.pdf"))
# dev.off()Let’s project known cellular markers.
UMAPPlot(scRNAseqData, label = FALSE, pt.size = 1.25, label.size = 4, group.by = "ident",
repel = TRUE)
# macrophages
FeaturePlot(scRNAseqData, features = c("CD14", "CD68"), cols = c("#ECECEC", "#DB003F"))# FeaturePlot(scRNAseqData, features = c("CD8"), cols = c("#ECECEC", "#DB003F"))
# b-cells
FeaturePlot(scRNAseqData, features = c("CD79A"), cols = c("#ECECEC", "#DB003F"))We check whether the targets genes were sequenced using our method. In case some genes are not available in our data we could filter them here.
[1] "SCGB3A2" "LIX1" "IGSF9B" "ND6" "ALB" "OR10A4" "RASEF" "NEDD4" "TRIM49D1" "TCL1A" "ND4L" "ATP8" "FBXO15" "F5" "TMEM212"
[16] "PTPRD" "CYP46A1" "OR2T33" "SORBS2" "ITGA7" "RNASE1" "FOS" "HMOX1" "LAPTM5" "MMP9" "SIGLEC1" "FTL" "CD14" "HCST" "TIMP3"
[31] "CCL2" "SAT1" "CD163" "PTGDS" "LGALS9" "ACKR1" "NT5DC2" "TGFBI" "C1QC" "S100A9"
This is to filter the list from genes that are not in the data.
gene_list_rm <- c("ND6", "TRIM49D1", "ND4L", "ATP8", "RNASE1")
temp = target_genes[!target_genes %in% gene_list_rm]
target_genes_qc <- c(temp)
# gene_list_qc <- gene_list
#
# for debug
# gene_list_qc_replace <- c("MRTFA")
# target_genes_qc <- target_genes
target_genes_qc [1] "SCGB3A2" "LIX1" "IGSF9B" "ALB" "OR10A4" "RASEF" "NEDD4" "TCL1A" "FBXO15" "F5" "TMEM212" "PTPRD" "CYP46A1" "OR2T33"
[15] "SORBS2" "ITGA7" "FOS" "HMOX1" "LAPTM5" "MMP9" "SIGLEC1" "FTL" "CD14" "HCST" "TIMP3" "CCL2" "SAT1" "CD163"
[29] "PTGDS" "LGALS9" "ACKR1" "NT5DC2" "TGFBI" "C1QC" "S100A9"
library(RColorBrewer)
p1 <- DotPlot(scRNAseqData, features = target_genes_qc,
cols = "RdBu")
p1 + theme(axis.text.x = element_text(angle = 45, hjust=1, size = 10))
ggsave(paste0(PLOT_loc, "/", Today, ".DotPlot.Targets.png"), plot = last_plot())ggsave(paste0(PLOT_loc, "/", Today, ".DotPlot.Targets.ps"), plot = last_plot())
ggsave(paste0(PLOT_loc, "/", Today, ".DotPlot.Targets.pdf"), plot = last_plot())
rm(p1)
# FeaturePlot(scRNAseqData, features = c(target_genes_qc),
# cols = c("#ECECEC", "#DB003F", "#9A3480","#1290D9"),
# combine = TRUE)
#
# ggsave(paste0(PLOT_loc, "/", Today, ".FeaturePlot.Targets.png"), plot = last_plot())
# ggsave(paste0(PLOT_loc, "/", Today, ".FeaturePlot.Targets.ps"), plot = last_plot())Let’s group this dotplot based on up- and down-regulation. First, we filter the target list for up- and down-regulated genes.
# Assuming 'gene_info' is your data.frame with the relevant columns loaded
gene_list_df_filtered <- subset(gene_list_df, Comments == "UP" | Comments == "DOWN")# Separate genes by significance
up_genes <- gene_list_df_filtered$GeneSymbol[gene_list_df_filtered$significance == "UP"]
down_genes <- gene_list_df_filtered$GeneSymbol[gene_list_df_filtered$significance == "DOWN"]
# Combine the genes, first DOWN then UP (or vice versa based on preference)
ordered_genes_qc <- c(down_genes, up_genes)library(RColorBrewer)
# Create DotPlot grouped by 'significance_group'
p1 <- DotPlot(scRNAseqData,
features = ordered_genes_qc,
cols = "RdBu")Warning: The following requested variables were not found: ND6, TRIM49D1, ND4L, ATP8, RNASE1
p1 + theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 10))
# Save the plots
ggsave(paste0(PLOT_loc, "/", Today, ".DotPlot.Targets.SignificanceGrouped.png"), plot = p1)Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
# VlnPlot(scRNAseqData, features = "DUSP27")
# VlnPlot files
ifelse(!dir.exists(file.path(PLOT_loc, "/VlnPlot")),
dir.create(file.path(PLOT_loc, "/VlnPlot")),
FALSE)[1] TRUE
VlnPlot_loc = paste0(PLOT_loc, "/VlnPlot")
for (GENE in target_genes_qc){
print(paste0("Projecting the expression of ", GENE, "."))
vp1 <- VlnPlot(scRNAseqData, features = GENE) +
xlab("cell communities") +
ylab(bquote("normalized expression")) +
theme(axis.title.x = element_text(color = "#000000", size = 14, face = "bold"),
axis.title.y = element_text(color = "#000000", size = 14, face = "bold"),
legend.position = "none")
ggsave(paste0(VlnPlot_loc, "/", Today, ".VlnPlot.",GENE,".png"), plot = last_plot())
ggsave(paste0(VlnPlot_loc, "/", Today, ".VlnPlot.",GENE,".ps"), plot = last_plot())
ggsave(paste0(VlnPlot_loc, "/", Today, ".VlnPlot.",GENE,".pdf"), plot = last_plot())
# print(vp1)
}[1] "Projecting the expression of SCGB3A2."
Saving 7 x 7 in image
[1] "Projecting the expression of LIX1."
[1] "Projecting the expression of IGSF9B."
[1] "Projecting the expression of ALB."
[1] "Projecting the expression of OR10A4."
[1] "Projecting the expression of RASEF."
[1] "Projecting the expression of NEDD4."
[1] "Projecting the expression of TCL1A."
[1] "Projecting the expression of FBXO15."
[1] "Projecting the expression of F5."
[1] "Projecting the expression of TMEM212."
[1] "Projecting the expression of PTPRD."
[1] "Projecting the expression of CYP46A1."
[1] "Projecting the expression of OR2T33."
[1] "Projecting the expression of SORBS2."
[1] "Projecting the expression of ITGA7."
[1] "Projecting the expression of FOS."
[1] "Projecting the expression of HMOX1."
[1] "Projecting the expression of LAPTM5."
[1] "Projecting the expression of MMP9."
[1] "Projecting the expression of SIGLEC1."
[1] "Projecting the expression of FTL."
[1] "Projecting the expression of CD14."
[1] "Projecting the expression of HCST."
[1] "Projecting the expression of TIMP3."
[1] "Projecting the expression of CCL2."
[1] "Projecting the expression of SAT1."
[1] "Projecting the expression of CD163."
[1] "Projecting the expression of PTGDS."
[1] "Projecting the expression of LGALS9."
[1] "Projecting the expression of ACKR1."
[1] "Projecting the expression of NT5DC2."
[1] "Projecting the expression of TGFBI."
[1] "Projecting the expression of C1QC."
[1] "Projecting the expression of S100A9."
Here we project genes to only the broad cell communities:
[1] CD3+ TC I CD3+ TC IV CD34+ EC I CD3+ TC V CD3+CD56+ NK II
[6] CD3+ TC VI CD68+IL18+TLR4+TREM2+ MRes CD3+CD56+ NK I ACTA2+ SMC CD3+ TC II
[11] FOXP3+ TC CD34+ EC II CD3+ TC III CD68+CD1C+ DC CD68+CASP1+IL1B+SELL MInf
[16] CD79A+ BCmem CD68+ABCA1+OLR1+TREM2+ FC CD68+KIT+ MC CD68+CD4+ Mono CD79+ BCplasma
20 Levels: CD68+CD4+ Mono CD68+IL18+TLR4+TREM2+ MRes CD68+CD1C+ DC CD68+CASP1+IL1B+SELL MInf CD68+ABCA1+OLR1+TREM2+ FC CD3+ TC I ... CD79+ BCplasma
Comparison between the macrophages cell communities (CD14/CD68+), and all other communities.
MAC.markers <- FindMarkers(object = scRNAseqData,
ident.1 = c("CD68+CASP1+IL1B+SELL MInf",
"CD68+CD1C+ DC",
"CD68+CD4+ Mono",
"CD68+IL18+TLR4+TREM2+ MRes",
"CD68+ABCA1+OLR1+TREM2+ FC"),
ident.2 = c(#"CD68+CASP1+IL1B+SELL MInf",
#"CD68+CD1C+ DC",
#"CD68+CD4+ Mono",
#"CD68+IL18+TLR4+TREM2+ MRes",
#"CD68+ABCA1+OLR1+TREM2+ FC",
"CD3+ TC I",
"CD3+ TC II",
"CD3+ TC III",
"CD3+ TC IV",
"CD3+ TC V",
"CD3+ TC VI",
"FOXP3+ TC",
"CD34+ EC I",
"CD34+ EC II",
"ACTA2+ SMC",
"CD3+CD56+ NK I",
"CD3+CD56+ NK II",
"CD68+KIT+ MC",
"CD79+ BCplasma",
"CD79A+ BCmem"))
DT::datatable(MAC.markers)Warning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.htmlWarning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.html
MAC_Volcano_TargetsA = EnhancedVolcano(MAC.markers,
lab = rownames(MAC.markers),
x = "avg_log2FC",
y = "p_val_adj",
selectLab = target_genes_qc,
axisLabSize = 12,
xlab = "average fold-change",
title = "Macrophage markers\n(Macrophage communities vs the rest)",
titleLabSize = 14,
pCutoff = 0.05/(nrow(MAC.markers)), # 20552 genes
FCcutoff = 1.25,
pointSize = 1.5,
labSize = 3.0,
legendLabels =c('NS','avg. fold-change','P',
'P & avg. fold-change'),
legendPosition = "right",
legendLabSize = 10,
legendIconSize = 3.0,
drawConnectors = TRUE,
widthConnectors = 0.2,
colConnectors = "#595A5C",
gridlines.major = FALSE,
gridlines.minor = FALSE)
MAC_Volcano_TargetsA
ggsave(paste0(PLOT_loc, "/", Today, ".Volcano.MAC.DEG.Targets.pdf"),
plot = MAC_Volcano_TargetsA)The target results are given below and written to a file.
Comparison between the smooth muscle cell communities (ACTA2+), and all other communities.
SMC.markers <- FindMarkers(object = scRNAseqData,
ident.1 = c("ACTA2+ SMC"),
ident.2 = c("CD68+CASP1+IL1B+SELL MInf",
"CD68+CD1C+ DC",
"CD68+CD4+ Mono",
"CD68+IL18+TLR4+TREM2+ MRes",
"CD68+ABCA1+OLR1+TREM2+ FC",
"CD3+ TC I",
"CD3+ TC II",
"CD3+ TC III",
"CD3+ TC IV",
"CD3+ TC V",
"CD3+ TC VI",
"FOXP3+ TC",
"CD34+ EC I",
"CD34+ EC II",
#"ACTA2+ SMC",
"CD3+CD56+ NK I",
"CD3+CD56+ NK II",
"CD68+KIT+ MC",
"CD79+ BCplasma",
"CD79A+ BCmem"))
DT::datatable(SMC.markers)Warning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.htmlWarning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.html
SMC_Volcano_TargetsA = EnhancedVolcano(SMC.markers,
lab = rownames(SMC.markers),
x = "avg_log2FC",
y = "p_val_adj",
selectLab = target_genes_qc,
axisLabSize = 12,
xlab = "average fold-change",
title = "SMC markers\n(SMC communities vs the rest)",
titleLabSize = 14,
pCutoff = 0.05/(nrow(SMC.markers)), # 20552 genes
FCcutoff = 1.25,
pointSize = 1.5,
labSize = 3.0,
legendLabels =c('NS','avg. fold-change','P',
'P & avg. fold-change'),
legendPosition = "right",
legendLabSize = 10,
legendIconSize = 3.0,
drawConnectors = TRUE,
widthConnectors = 0.2,
colConnectors = "#595A5C",
gridlines.major = FALSE,
gridlines.minor = FALSE)
SMC_Volcano_TargetsA
ggsave(paste0(PLOT_loc, "/", Today, ".Volcano.SMC.DEG.Targets.pdf"),
plot = SMC_Volcano_TargetsA)The target results are given below and written to a file.
Comparison between the endothelial cell communities (CD34+), and all other communities.
EC.markers <- FindMarkers(object = scRNAseqData,
ident.1 = c("CD34+ EC I",
"CD34+ EC II"),
ident.2 = c("CD68+CASP1+IL1B+SELL MInf",
"CD68+CD1C+ DC",
"CD68+CD4+ Mono",
"CD68+IL18+TLR4+TREM2+ MRes",
"CD68+ABCA1+OLR1+TREM2+ FC",
"CD3+ TC I",
"CD3+ TC II",
"CD3+ TC III",
"CD3+ TC IV",
"CD3+ TC V",
"CD3+ TC VI",
"FOXP3+ TC",
# "CD34+ EC I",
# "CD34+ EC II",
"ACTA2+ SMC",
"CD3+CD56+ NK I",
"CD3+CD56+ NK II",
"CD68+KIT+ MC",
"CD79+ BCplasma",
"CD79A+ BCmem"))
DT::datatable(EC.markers)Warning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.htmlWarning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.html
EC_Volcano_TargetsA = EnhancedVolcano(EC.markers,
lab = rownames(EC.markers),
x = "avg_log2FC",
y = "p_val_adj",
selectLab = target_genes_qc,
axisLabSize = 12,
xlab = "average fold-change",
title = "Endothelial cell markers\n(EC communities vs the rest)",
titleLabSize = 14,
pCutoff = 0.05/(nrow(EC.markers)), # 20552 genes
FCcutoff = 1.25,
pointSize = 1.5,
labSize = 3.0,
legendLabels =c('NS','avg. fold-change','P',
'P & avg. fold-change'),
legendPosition = "right",
legendLabSize = 10,
legendIconSize = 3.0,
drawConnectors = TRUE,
widthConnectors = 0.2,
colConnectors = "#595A5C",
gridlines.major = FALSE,
gridlines.minor = FALSE)
EC_Volcano_TargetsA
ggsave(paste0(PLOT_loc, "/", Today, ".Volcano.EC.DEG.Targets.pdf"),
plot = EC_Volcano_TargetsA)The target results are given below and written to a file.
Comparison between the T-cell communities (CD3/CD4/CD8+), and all other communities.
TC.markers <- FindMarkers(object = scRNAseqData,
ident.1 = c("CD3+ TC I",
"CD3+ TC II",
"CD3+ TC III",
"CD3+ TC IV",
"CD3+ TC V",
"CD3+ TC VI",
"FOXP3+ TC"),
ident.2 = c("CD68+CASP1+IL1B+SELL MInf",
"CD68+CD1C+ DC",
"CD68+CD4+ Mono",
"CD68+IL18+TLR4+TREM2+ MRes",
"CD68+ABCA1+OLR1+TREM2+ FC",
# "CD3+ TC I",
# "CD3+ TC II",
# "CD3+ TC III",
# "CD3+ TC IV",
# "CD3+ TC V",
# "CD3+ TC VI",
# "FOXP3+ TC",
"CD34+ EC I",
"CD34+ EC II",
"ACTA2+ SMC",
"CD3+CD56+ NK I",
"CD3+CD56+ NK II",
"CD68+KIT+ MC",
"CD79+ BCplasma",
"CD79A+ BCmem"))
DT::datatable(TC.markers)Warning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.htmlWarning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.html
TC_Volcano_TargetsA = EnhancedVolcano(TC.markers,
lab = rownames(TC.markers),
x = "avg_log2FC",
y = "p_val_adj",
selectLab = target_genes_qc,
axisLabSize = 12,
xlab = "average fold-change",
title = "T-cell markers\n(T-cell communities vs the rest)",
titleLabSize = 14,
pCutoff = 0.05/nrow(TC.markers), # 20552 genes
FCcutoff = 1.25,
pointSize = 1.5,
labSize = 3.0,
legendLabels =c('NS','avg. fold-change','P',
'P & avg. fold-change'),
legendPosition = "right",
legendLabSize = 10,
legendIconSize = 3.0,
drawConnectors = TRUE,
widthConnectors = 0.2,
colConnectors = "#595A5C",
gridlines.major = FALSE,
gridlines.minor = FALSE)
TC_Volcano_TargetsA
ggsave(paste0(PLOT_loc, "/", Today, ".Volcano.TC.DEG.Targets.pdf"),
plot = TC_Volcano_TargetsA)The target results are given below and written to a file.
Comparison between the B-cell communities (CD79A+), and all other communities.
BC.markers <- FindMarkers(object = scRNAseqData,
ident.1 = c("CD79+ BCplasma",
"CD79A+ BCmem"),
ident.2 = c("CD68+CASP1+IL1B+SELL MInf",
"CD68+CD1C+ DC",
"CD68+CD4+ Mono",
"CD68+IL18+TLR4+TREM2+ MRes",
"CD68+ABCA1+OLR1+TREM2+ FC",
"CD3+ TC I",
"CD3+ TC II",
"CD3+ TC III",
"CD3+ TC IV",
"CD3+ TC V",
"CD3+ TC VI",
"FOXP3+ TC",
"CD34+ EC I",
"CD34+ EC II",
"ACTA2+ SMC",
"CD3+CD56+ NK I",
"CD3+CD56+ NK II",
"CD68+KIT+ MC"
# "CD79+ BCplasma",
# "CD79A+ BCmem"
))
DT::datatable(BC.markers)Warning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.htmlWarning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.html
BC_Volcano_TargetsA = EnhancedVolcano(BC.markers,
lab = rownames(BC.markers),
x = "avg_log2FC",
y = "p_val_adj",
selectLab = target_genes_qc,
axisLabSize = 12,
xlab = "average fold-change",
title = "B-cell markers\n(B-cell communities vs the rest)",
titleLabSize = 14,
pCutoff = 0.05/nrow(BC.markers), # 20552 genes
FCcutoff = 1.25,
pointSize = 1.5,
labSize = 3.0,
legendLabels =c('NS','avg. fold-change','P',
'P & avg. fold-change'),
legendPosition = "right",
legendLabSize = 10,
legendIconSize = 3.0,
drawConnectors = TRUE,
widthConnectors = 0.2,
colConnectors = "#595A5C",
gridlines.major = FALSE,
gridlines.minor = FALSE)
BC_Volcano_TargetsA
ggsave(paste0(PLOT_loc, "/", Today, ".Volcano.BC.DEG.Targets.pdf"),
plot = BC_Volcano_TargetsA)The target results are given below and written to a file.
Comparison between the mast cell communities (KIT+), and all other communities.
MC.markers <- FindMarkers(object = scRNAseqData,
ident.1 = c("CD68+KIT+ MC"),
ident.2 = c("CD68+CASP1+IL1B+SELL MInf",
"CD68+CD1C+ DC",
"CD68+CD4+ Mono",
"CD68+IL18+TLR4+TREM2+ MRes",
"CD68+ABCA1+OLR1+TREM2+ FC",
"CD3+ TC I",
"CD3+ TC II",
"CD3+ TC III",
"CD3+ TC IV",
"CD3+ TC V",
"CD3+ TC VI",
"FOXP3+ TC",
"CD34+ EC I",
"CD34+ EC II",
"ACTA2+ SMC",
"CD3+CD56+ NK I",
"CD3+CD56+ NK II",
# "CD68+KIT+ MC",
"CD79+ BCplasma",
"CD79A+ BCmem"))
DT::datatable(MC.markers)Warning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.htmlWarning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.html
MC_Volcano_TargetsA = EnhancedVolcano(MC.markers,
lab = rownames(MC.markers),
x = "avg_log2FC",
y = "p_val_adj",
selectLab = target_genes_qc,
axisLabSize = 12,
xlab = "average fold-change",
title = "Mast cell markers\n(Mast cell communities vs the rest)",
titleLabSize = 14,
pCutoff = 0.05/nrow(MC.markers), # 20552 genes
FCcutoff = 1.25,
pointSize = 1.5,
labSize = 3.0,
legendLabels =c('NS','avg. fold-change','P',
'P & avg. fold-change'),
legendPosition = "right",
legendLabSize = 10,
legendIconSize = 3.0,
drawConnectors = TRUE,
widthConnectors = 0.2,
colConnectors = "#595A5C",
gridlines.major = FALSE,
gridlines.minor = FALSE)
MC_Volcano_TargetsA
ggsave(paste0(PLOT_loc, "/", Today, ".Volcano.MC.DEG.Targets.pdf"),
plot = MC_Volcano_TargetsA)The target results are given below and written to a file.
Comparison between the natural killer cell communities (NCAM1+), and all other communities.
NK.markers <- FindMarkers(object = scRNAseqData,
ident.1 = c("CD3+CD56+ NK I",
"CD3+CD56+ NK II"),
ident.2 = c("CD68+CASP1+IL1B+SELL MInf",
"CD68+CD1C+ DC",
"CD68+CD4+ Mono",
"CD68+IL18+TLR4+TREM2+ MRes",
"CD68+ABCA1+OLR1+TREM2+ FC",
"CD3+ TC I",
"CD3+ TC II",
"CD3+ TC III",
"CD3+ TC IV",
"CD3+ TC V",
"CD3+ TC VI",
"FOXP3+ TC",
"CD34+ EC I",
"CD34+ EC II",
"ACTA2+ SMC",
# "CD3+CD56+ NK I",
# "CD3+CD56+ NK II",
"CD68+KIT+ MC",
"CD79+ BCplasma",
"CD79A+ BCmem"))
DT::datatable(NK.markers)Warning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.htmlWarning: It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.html
NK_Volcano_TargetsA = EnhancedVolcano(NK.markers,
lab = rownames(NK.markers),
x = "avg_log2FC",
y = "p_val_adj",
selectLab = target_genes_qc,
axisLabSize = 12,
xlab = "average fold-change",
title = "NK markers\n(NK-cell communities vs the rest)",
titleLabSize = 14,
pCutoff = 0.05/nrow(NK.markers), # 20552 genes
FCcutoff = 1.25,
pointSize = 1.5,
labSize = 3.0,
legendLabels =c('NS','avg. fold-change','P',
'P & avg. fold-change'),
legendPosition = "right",
legendLabSize = 10,
legendIconSize = 3.0,
drawConnectors = TRUE,
widthConnectors = 0.2,
colConnectors = "#595A5C",
gridlines.major = FALSE,
gridlines.minor = FALSE)
NK_Volcano_TargetsA
ggsave(paste0(PLOT_loc, "/", Today, ".Volcano.NK.DEG.Targets.pdf"),
plot = NK_Volcano_TargetsA)The target results are given below and written to a file.
List of samples to be included based on informed consent (see above).
variables_of_interest <- c("Hospital", "ORyear", "Artery_summary",
"Age", "Gender",
"TC_final", "LDL_final", "HDL_final", "TG_final",
"systolic", "diastoli", "GFR_MDRD", "BMI",
"KDOQI", "BMI_WHO",
"SmokerStatus", "AlcoholUse",
"DiabetesStatus",
"Hypertension.selfreport", "Hypertension.selfreportdrug", "Hypertension.composite", "Hypertension.drugs",
"Med.anticoagulants", "Med.all.antiplatelet", "Med.Statin.LLD",
"Stroke_Dx",
"sympt", "Symptoms.5G", "AsymptSympt", "AsymptSympt2G",
"Symptoms.Update2G", "Symptoms.Update3G", "indexsymptoms_latest_4g",
"restenos", "stenose",
"CAD_history", "PAOD", "Peripheral.interv",
"EP_composite", "EP_composite_time", "EP_major", "EP_major_time")
temp <- subset(scRNAseqDataMetaAE.all, select = c("Patient", variables_of_interest))
# str(temp)scRNAseqDataCEA39@meta.data <- merge(scRNAseqDataCEA39@meta.data, temp, by.x = "Patient", by.y = "Patient")
scRNAseqDataCEA39@meta.data <- dplyr::rename(scRNAseqDataCEA39@meta.data, "STUDY_NUMBER" = "Patient")
# str(scRNAseqDataCEA39@meta.data)temp2 <- as_tibble(subset(scRNAseqDataCEA39@meta.data, select = c("STUDY_NUMBER", "orig.ident", "nCount_RNA", "nFeature_RNA",
"Plate", "Batch", "C.H", "Type", "percent.mt",
"nCount_SCT", "nFeature_SCT", "seurat_clusters")))
# fwrite(temp2,
# file = paste0(OUT_loc, "/", Today, ".AESCRNA.CEA.39pts.samplelist.after_qc.IC_commercial.csv"),
# sep = ",", row.names = FALSE, col.names = TRUE,
# showProgress = TRUE)
# rm(temp2)
#
# temp <- dplyr::rename(temp, "STUDY_NUMBER" = "Patient")
# fwrite(temp,
# file = paste0(OUT_loc, "/", Today, ".AESCRNA.CEA.39pts.clinicaldata.after_qc.IC_commercial.csv"),
# sep = ",", row.names = FALSE, col.names = TRUE,
# showProgress = TRUE)
# rm(temp)
#
# saveRDS(scRNAseqDataCEA39, file = paste0(OUT_loc, "/", Today, ".AESCRNA.CEA.39pts.Seurat.after_qc.IC_commercial.RDS"))
fwrite(temp2,
file = paste0(OUT_loc, "/", Today, ".AESCRNA.CEA.39pts.samplelist.after_qc.IC_academic.csv"),
sep = ",", row.names = FALSE, col.names = TRUE,
showProgress = TRUE)
rm(temp2)
temp <- dplyr::rename(temp, "STUDY_NUMBER" = "Patient")
fwrite(temp,
file = paste0(OUT_loc, "/", Today, ".AESCRNA.CEA.39pts.clinicaldata.after_qc.IC_academic.csv"),
sep = ",", row.names = FALSE, col.names = TRUE,
showProgress = TRUE)
rm(temp)
saveRDS(scRNAseqDataCEA39, file = paste0(OUT_loc, "/", Today, ".AESCRNA.CEA.39pts.Seurat.after_qc.IC_academic.RDS"))Version: v1.1.2
Last update: 2024-01-09
Written by: Sander W. van der Laan (s.w.vanderlaan-2[at]umcutrecht.nl).
Description: Script to load single-cell RNA sequencing (scRNAseq) data, and perform quality control (QC), and initial mapping to cells.
Minimum requirements: R version 3.5.2 (2018-12-20) -- 'Eggshell Igloo', macOS Mojave (10.14.2).
**MoSCoW To-Do List**
The things we Must, Should, Could, and Would have given the time we have.
_M_
_S_
_C_
_W_
**Changes log**
* v1.1.1 Textual fixes.
* v1.1.1 Fix writing baseline table.
* v1.1.0 Update to study database.
* v1.0.2 Fixes to the start of the notebook. Update to loading of the clinical data. Fix on the gene-filtering.
* v1.0.1 Update to main AEDB (there is an error in the Age-variable in the new version). Fewer patients in scRNAseq (32 vs 39 with the newer dataset).
* v1.0.0 Initial version.
R version 4.4.1 (2024-06-14)
Platform: x86_64-apple-darwin20
Running under: macOS 15.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/New_York
tzcode source: internal
attached base packages:
[1] stats4 grid tools stats graphics grDevices utils datasets methods base
other attached packages:
[1] RColorBrewer_1.1-3 Seurat_5.1.0 SeuratObject_5.0.2
[4] sp_2.1-4 BiocManager_1.30.25 magrittr_2.0.3
[7] rmarkdown_2.28 annotables_0.2.0 EnhancedVolcano_1.22.0
[10] ggrepel_0.9.6 EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.28.1
[13] AnnotationFilter_1.28.0 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 mygene_1.40.0
[16] txdbmaker_1.0.1 org.Hs.eg.db_3.19.1 DESeq2_1.44.0
[19] SummarizedExperiment_1.34.0 MatrixGenerics_1.16.0 matrixStats_1.4.1
[22] GenomicFeatures_1.56.0 GenomicRanges_1.56.2 AnnotationDbi_1.66.0
[25] Biobase_2.64.0 GenomeInfoDb_1.40.1 IRanges_2.38.1
[28] S4Vectors_0.42.1 BiocGenerics_0.50.0 Hmisc_5.1-3
[31] survminer_0.4.9 survival_3.7-0 GGally_2.2.1
[34] PerformanceAnalytics_2.0.4 xts_0.14.0 zoo_1.8-12
[37] ggcorrplot_0.1.4.999 corrr_0.4.4 reshape2_1.4.4
[40] meta_7.0-0 metadat_1.2-0 qqman_0.1.9
[43] sjlabelled_1.2.0 patchwork_1.3.0.9000 labelled_2.13.0
[46] sjPlot_2.8.16 UpSetR_1.4.0 ggpubr_0.6.0.999
[49] forestplot_3.1.5 abind_1.4-8 checkmate_2.3.2
[52] pheatmap_1.0.12 devtools_2.4.5 usethis_3.0.0
[55] BlandAltmanLeh_0.3.1 tableone_0.13.2 openxlsx_4.2.7.1
[58] haven_2.5.4 eeptools_1.2.5 DT_0.33
[61] knitr_1.48 lubridate_1.9.3 forcats_1.0.0
[64] stringr_1.5.1 purrr_1.0.2 tibble_3.2.1
[67] ggplot2_3.5.1 tidyverse_2.0.0 data.table_1.16.2
[70] naniar_1.1.0 tidylog_1.1.0 tidyr_1.3.1
[73] dplyr_1.1.4 optparse_1.7.5 readr_2.1.5
[76] pander_0.6.5 R.utils_2.12.3 R.oo_1.26.0
[79] R.methodsS3_1.8.2 worcs_0.1.15 credentials_2.0.2
loaded via a namespace (and not attached):
[1] igraph_2.0.3 ica_1.0-3 plotly_4.10.4 Formula_1.2-5 zlibbioc_1.50.0
[6] gert_2.1.4 tidyselect_1.2.1 bit_4.5.0 lattice_0.22-6 rjson_0.2.23
[11] blob_1.2.4 urlchecker_1.0.1 S4Arrays_1.4.1 parallel_4.4.1 png_0.1-8
[16] tinytex_0.53 cli_3.6.3 ProtGenerics_1.36.0 askpass_1.2.1 sjstats_0.19.0
[21] goftest_1.2-3 openssl_2.2.2 textshaping_0.4.0 BiocIO_1.14.0 uwot_0.2.2
[26] curl_5.2.3 mime_0.12 evaluate_1.0.1 leiden_0.4.3.1 gsubfn_0.7
[31] stringi_1.8.4 backports_1.5.0 XML_3.99-0.17 httpuv_1.6.15 rappdirs_0.3.3
[36] splines_4.4.1 getopt_1.20.4 KMsurv_0.1-5 ggbeeswarm_0.7.2 sctransform_0.4.1
[41] sessioninfo_1.2.2 DBI_1.2.3 jquerylib_0.1.4 withr_3.0.1 class_7.3-22
[46] systemfonts_1.1.0 lmtest_0.9-40 rtracklayer_1.64.0 htmlwidgets_1.6.4 fs_1.6.4
[51] biomaRt_2.60.1 labeling_0.4.3 gh_1.4.1 SparseArray_1.4.8 ranger_0.16.0
[56] reticulate_1.39.0 XVector_0.44.0 UCSC.utils_1.0.0 timechange_0.3.0 fansi_1.0.6
[61] calibrate_1.7.7 RSpectra_0.16-2 irlba_2.3.5.1 ggrastr_1.0.2 fastDummies_1.7.4
[66] ellipsis_0.3.2 lazyeval_0.2.2 yaml_2.3.10 scattermore_1.2 crayon_1.5.3
[71] RcppAnnoy_0.0.22 progressr_0.14.0 later_1.3.2 ggridges_0.5.6 codetools_0.2-20
[76] base64enc_0.1-3 profvis_0.4.0 KEGGREST_1.44.1 Rtsne_0.17 limma_3.60.6
[81] Rsamtools_2.20.0 filelock_1.0.3 rticles_0.27 foreign_0.8-87 sqldf_0.4-11
[86] pkgconfig_2.0.3 xml2_1.3.6 spatstat.univar_3.0-1 mathjaxr_1.6-0 GenomicAlignments_1.40.0
[91] spatstat.sparse_3.1-0 viridisLite_0.4.2 performance_0.12.3 xtable_1.8-4 car_3.1-3
[96] plyr_1.8.9 httr_1.4.7 globals_0.16.3 sys_3.4.3 pkgbuild_1.4.4
[101] beeswarm_0.4.0 htmlTable_2.4.3 broom_1.0.7 nlme_3.1-166 dbplyr_2.5.0
[106] survMisc_0.5.6 crosstalk_1.2.1 ggeffects_1.7.2 lme4_1.1-35.5 digest_0.6.37
[111] numDeriv_2016.8-1.1 Matrix_1.7-0 farver_2.1.2 tzdb_0.4.0 rpart_4.1.23
[116] glue_1.8.0 cachem_1.1.0 BiocFileCache_2.12.0 polyclip_1.10-7 generics_0.1.3
[121] Biostrings_2.72.1 visdat_0.6.0 CompQuadForm_1.4.3 presto_1.0.0 proto_1.0.0
[126] survey_4.4-2 parallelly_1.38.0 statmod_1.5.0 pkgload_1.4.0 arm_1.14-4
[131] RcppHNSW_0.6.0 ragg_1.3.3 carData_3.0-5 minqa_1.2.8 pbapply_1.7-2
[136] httr2_1.0.5 spam_2.11-0 utf8_1.2.4 mitools_2.4 sjmisc_2.8.10
[141] datawizard_0.13.0 ggsignif_0.6.4 gridExtra_2.3 shiny_1.9.1 GenomeInfoDbData_1.2.12
[146] clisymbols_1.2.0 RCurl_1.98-1.16 memoise_2.0.1 scales_1.3.0 future_1.34.0
[151] RANN_2.6.2 renv_1.0.11 km.ci_0.5-6 spatstat.data_3.1-2 rstudioapi_0.16.0
[156] cluster_2.1.6 spatstat.utils_3.1-0 hms_1.1.3 fitdistrplus_1.2-1 munsell_0.5.1
[161] cowplot_1.1.3 colorspace_2.1-1 rlang_1.1.4 quadprog_1.5-8 dotCall64_1.2
[166] xfun_0.48 prereg_0.6.0 coda_0.19-4.1 e1071_1.7-16 metafor_4.6-0
[171] remotes_2.5.0 ggsci_3.2.0 bitops_1.0-9 promises_1.3.0 RSQLite_2.3.7
[176] DelayedArray_0.30.1 proxy_0.4-27 compiler_4.4.1 prettyunits_1.2.0 boot_1.3-31
[181] listenv_0.9.1 Rcpp_1.0.13 tensor_1.5 MASS_7.3-61 progress_1.2.3
[186] BiocParallel_1.38.0 insight_0.20.5 spatstat.random_3.3-2 R6_2.5.1 fastmap_1.2.0
[191] rstatix_0.7.2 vipor_0.4.7 ROCR_1.0-11 ggstats_0.7.0 vcd_1.4-13
[196] nnet_7.3-19 gtable_0.3.5 KernSmooth_2.23-24 miniUI_0.1.1.1 deldir_2.0-4
[201] htmltools_0.5.8.1 bit64_4.5.2 spatstat.explore_3.3-2 lifecycle_1.0.4 zip_2.3.1
[206] nloptr_2.1.1 restfulr_0.0.15 sass_0.4.9 vctrs_0.6.5 spatstat.geom_3.3-3
[211] future.apply_1.11.2 bslib_0.8.0 pillar_1.9.0 locfit_1.5-9.10 jsonlite_1.8.9
[216] chron_2.3-61
rm(backup.scRNAseqData)
rm(scRNAseqData, scRNAseqDataCEA39)
save.image(paste0(PROJECT_loc, "/",Today,".",PROJECTNAME,".AESCRNA.results.RData"))| © 1979-2024 Sander W. van der Laan | s.w.vanderlaan[at]gmail[dot]com | vanderlaanand.science. |